Multi-channel multi-step integration model for generative visual dialogue
Sihang CHEN, Aiwen JIANG, Zhaoyang CUI, Mingwen WANG
Journal of Computer Applications 2024, 44(1): 39-46. DOI: 10.11772/j.issn.1001-9081.2023010055

The visual dialogue task has made significant progress in multimodal information fusion and inference. However, mainstream models remain limited when answering questions that involve relatively clear semantic attributes and spatial relationships, and few of them can explicitly provide a fine-grained semantic representation of the image content before generating a response, leaving a semantic gap between visual feature representations and textual semantics such as the dialogue history and the current question. Therefore, a visual dialogue model based on Multi-Channel and Multi-step Integration (MCMI) was proposed to explicitly provide a set of fine-grained semantic descriptions of the visual content. Through interactions and multi-step integration among vision, semantics, and dialogue history, the semantic representation of the question was enriched and more accurate answers were decoded. On the VisDial v0.9 and VisDial v1.0 datasets, compared with the Dual-channel Multi-hop Reasoning Model (DMRM), the proposed MCMI model improved Mean Reciprocal Rank (MRR) by 1.95 and 2.12 percentage points respectively, recall rate (R@1) by 2.62 and 3.09 percentage points respectively, and mean rank of the correct answer (Mean) by 0.88 and 0.99 respectively; on the VisDial v1.0 dataset, compared with the recent Unified Transformer Contrastive learning model (UTC), the MCMI model improved MRR, R@1, and Mean by 0.06 percentage points, 0.68 percentage points, and 1.47 respectively. To further evaluate the quality of the generated dialogue, two subjective indicators were proposed: the Turing-test passing proportion M1 and the dialogue quality score (on a five-point scale) M2. Compared with the baseline model DMRM on the VisDial v0.9 dataset, the MCMI model improved M1 by 9.00 percentage points and M2 by 0.70.
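
As a rough illustration of the multi-step integration idea described in the abstract, the PyTorch sketch below lets a question representation repeatedly attend over three channels (vision, fine-grained semantics, dialogue history) and fuse the results. All module names, dimensions, the step count, and the fusion rule are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of multi-channel multi-step integration; the fusion
# scheme and hyperparameters are assumptions, not the MCMI paper's code.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Attend over one channel (vision / semantics / history) with the question."""
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, query, channel):
        # query: (B, 1, D); channel: (B, N, D)
        out, _ = self.attn(query, channel, channel)
        return out

class MCMISketch(nn.Module):
    def __init__(self, dim=512, steps=3):
        super().__init__()
        self.steps = steps
        self.vision_attn = ChannelAttention(dim)
        self.semantic_attn = ChannelAttention(dim)
        self.history_attn = ChannelAttention(dim)
        self.fuse = nn.Linear(3 * dim, dim)

    def forward(self, question, vision, semantics, history):
        # question: (B, D); each channel: (B, N_i, D)
        q = question.unsqueeze(1)
        for _ in range(self.steps):               # multi-step integration
            v = self.vision_attn(q, vision)       # visual channel
            s = self.semantic_attn(q, semantics)  # fine-grained semantic channel
            h = self.history_attn(q, history)     # dialogue-history channel
            q = q + torch.tanh(self.fuse(torch.cat([v, s, h], dim=-1)))
        return q.squeeze(1)                       # enriched question representation

# Toy usage with random features
B, D = 2, 512
model = MCMISketch(dim=D)
out = model(torch.randn(B, D), torch.randn(B, 36, D),
            torch.randn(B, 10, D), torch.randn(B, 8, D))
print(out.shape)  # torch.Size([2, 512])
```

The enriched question vector would then feed an answer decoder; that stage is omitted here since the abstract does not specify it.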

Multi-similarity K-nearest neighbor classification algorithm with ordered pairs of normalized real numbers
Haoyang CUI, Hui ZHANG, Lei ZHOU, Chunming YANG, Bo LI, Xujian ZHAO
Journal of Computer Applications 2023, 43(9): 2673-2678. DOI: 10.11772/j.issn.1001-9081.2022091376

To address the problems that the performance of nearest neighbor classification is strongly affected by the adopted similarity or distance measure and that selecting the optimal measure is difficult, a K-Nearest Neighbor algorithm with Ordered Pairs of Normalized real numbers (OPNs-KNN), which adopts multiple similarity measures, was proposed. Firstly, the new mathematical theory of Ordered Pairs of Normalized real numbers (OPNs) was introduced into machine learning, and all samples in the training and test sets were converted into OPNs through multiple similarity or distance measures, so that each OPN carried different similarity information. Then, an improved nearest neighbor algorithm was used to classify the OPNs, allowing different similarity or distance measures to be mixed and to complement each other, thereby improving classification performance. Experimental results show that, compared with six improved nearest neighbor classification algorithms such as the distance-Weighted K-Nearest-Neighbor (WKNN) rule on the Iris, seeds, and other datasets, OPNs-KNN improves classification accuracy by 0.29 to 15.28 percentage points, which demonstrates that the proposed algorithm can greatly improve classification performance.
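
To make the multi-similarity idea concrete, the sketch below scores each training/test pair under several distance measures, normalizes each measure to [0, 1], and combines the normalized scores before the usual k-nearest-neighbor vote. The combination rule here (a plain average) is an illustrative stand-in for the OPN construction in the paper, not the authors' formulation.

```python
# Hypothetical sketch of mixing several normalized distance measures in k-NN;
# averaging stands in for the paper's OPN-based combination.
import numpy as np
from collections import Counter

def euclidean(a, b):
    return np.linalg.norm(a - b, axis=-1)

def manhattan(a, b):
    return np.abs(a - b).sum(axis=-1)

def cosine_dist(a, b):
    num = (a * b).sum(axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-12
    return 1.0 - num / den

def multi_similarity_knn(X_train, y_train, x, k=5,
                         measures=(euclidean, manhattan, cosine_dist)):
    """Classify one sample x by mixing several normalized distance measures."""
    combined = np.zeros(len(X_train))
    for d in measures:
        dist = d(X_train, x)                             # one measure's distances
        span = dist.max() - dist.min()
        combined += (dist - dist.min()) / (span + 1e-12)  # normalize to [0, 1]
    combined /= len(measures)                            # mix the measures
    nearest = np.argsort(combined)[:k]                   # k nearest, mixed score
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] > 0).astype(int)
print(multi_similarity_knn(X, y, rng.normal(size=4)))
```

Because each measure is rescaled before mixing, no single metric dominates by virtue of its raw scale, which is the intuition behind letting the measures complement one another.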
